NSF PAR Search | NSF Public Access Repository

Ninth Workshop on Human-In-the-Loop Data Analytics (HILDA)

https://doi.org/10.1145/3722212.3724485

Chang, Remco; Rong, Kexin; Shraga, Roee (June 2025, ACM)

Free, publicly-accessible full text available June 22, 2026

CanDE: A Lightweight Locality-Sensitive Hashing Add-on for Candidate-Based Distribution Estimation

https://doi.org/10.1109/BigData62323.2024.10826065

Meng, Jingfan; Wang, Huayi; Rong, Kexin; Xu, Jun (December 2024, IEEE)

Full Text Available

Inshrinkerator: Compressing Deep Learning Training Checkpoints via Dynamic Quantization

https://doi.org/10.1145/3698038.3698553

Agrawal, Amey; Reddy, Sameer; Bhattamishra, Satwik; Nookala, Venkata Prabhakara Sarath; Vashishth, Vidushi; Rong, Kexin; Tumanov, Alexey (November 2024, ACM)

The likelihood of encountering in-training failures rises substantially with larger Deep Learning (DL) training workloads, leading to lost work and resource wastage. Such failures are typically offset by checkpointing, which comes at the cost of storage and network bandwidth overhead. State-of-the-art approaches involve lossy model compression mechanisms, which induce a tradeoff between the resulting model quality and compression ratio. We make a key enabling observation that the sensitivity of model weights to compression varies during training, and different weights benefit from different quantization levels, ranging from retaining full precision to pruning. We propose (1) a non-uniform quantization scheme that leverages this variation, (2) an efficient search mechanism that dynamically finds the best quantization configurations, and (3) a quantization-aware delta compression mechanism that rearranges weights to minimize checkpoint differences, thereby improving compression. We instantiate these contributions in Inshrinkerator, an in-training checkpoint compression system for DL workloads. Our experiments show that Inshrinkerator consistently achieves a better tradeoff between accuracy and compression ratio than prior work, enabling a compression ratio of up to 39x and withstanding up to 10 restores with negligible accuracy impact in fault-tolerant training. Inshrinkerator achieves at least an order of magnitude reduction in checkpoint size for failure recovery and transfer learning without any loss of accuracy.
Full Text Available

SketchQL: Video Moment Querying with a Visual Query Interface

https://doi.org/10.1145/3677140

Wu, Renzhi; Chunduri, Pramod; Payani, Ali; Chu, Xu; Arulraj, Joy; Rong, Kexin (October 2024, Proceedings of the ACM on Management of Data)

Localizing video moments based on the movement patterns of objects is an important task in video analytics. Existing video analytics systems offer two types of querying interfaces, based on natural language and SQL, respectively. However, both types of interfaces have major limitations. SQL-based systems require high query specification time, whereas natural language-based systems require large training datasets to achieve satisfactory retrieval accuracy. To address these limitations, we present SketchQL, a video database management system (VDBMS) for offline, exploratory video moment retrieval that is both easy to use and generalizes well across multiple video moment datasets. To improve ease of use, SketchQL features a visual query interface that enables users to sketch complex visual queries through intuitive drag-and-drop actions. To improve generalizability, SketchQL operates on object-tracking primitives that are reliably extracted across various datasets using pre-trained models. We present a learned similarity search algorithm for retrieving video moments closely matching the user's visual query based on object trajectories. SketchQL trains the model on a diverse dataset generated with a novel simulator, which enhances its accuracy across a wide array of datasets and queries. We evaluate SketchQL on four real-world datasets with nine queries, demonstrating its superior usability and retrieval accuracy over state-of-the-art VDBMSs.
Full Text Available

Formation potential of disinfection byproducts during chlorination of petroleum hydrocarbon-contaminated drinking water

https://doi.org/10.1016/j.chemosphere.2024.142057

Brinkmann, Mandy-Tanita; Rong, Kexin; Xie, Yuefeng; Yan, Tao (June 2024, Chemosphere)

Full Text Available

SketchQL Demonstration: Zero-Shot Video Moment Querying with Sketches

https://doi.org/10.14778/3685800.3685892

Wu, Renzhi; Chunduri, Pramod; Shah, Dristi J; Aravind, Ashmitha Julius; Payani, Ali; Chu, Xu; Arulraj, Joy; Rong, Kexin (August 2024, Proceedings of the VLDB Endowment)

In this paper, we present SketchQL, a video database management system (VDBMS) for retrieving video moments with a sketch-based query interface. This novel interface allows users to specify object trajectory events with simple mouse drag-and-drop operations. Users can use trajectories of single objects as building blocks to compose complex events. Using a pre-trained model that encodes trajectory similarity, SketchQL achieves zero-shot video moment retrieval by performing similarity searches over the video to identify the clips most similar to the visual query. In this demonstration, we introduce the graphical user interface of SketchQL and detail its functionalities and interaction mechanisms. We also demonstrate the end-to-end usage of SketchQL, from query composition to video moment retrieval, using real-world scenarios.
Full Text Available

Interactive Demonstration of EVA

https://doi.org/10.14778/3611540.3611626

Kakkar, Gaurav Tarlok; Rajoria, Aryan; Kalluraya, Myna Prasanna; Raju, Ashmita; Cao, Jiashen; Rong, Kexin; Arulraj, Joy (August 2023, Proceedings of the VLDB Endowment)

In this demonstration, we will present EVA, an end-to-end AI-Relational database management system. We will demonstrate the capabilities and utility of EVA using three usage scenarios: (1) EVA serves as a backend for an exploratory video analytics interface developed using Streamlit and React, (2) EVA seamlessly integrates with the Python and Data Science ecosystems by allowing users to access EVA in a Python notebook alongside other popular libraries such as Pandas and Matplotlib, and (3) EVA facilitates bulk labeling with Label Studio, a widely used labeling framework. By optimizing complex vision queries, we illustrate how EVA allows a wide range of application developers to harness the recent advances in computer vision.
Full Text Available

Locality-sensitive hashing for earthquake detection: a case study of scaling data-driven science

https://doi.org/10.14778/3236187.3236214

Rong, Kexin; Yoon, Clara E.; Bergen, Karianne J.; Elezabi, Hashem; Bailis, Peter; Levis, Philip; Beroza, Gregory C. (July 2018, Proceedings of the VLDB Endowment)

Full Text Available
